Machine learning offers an exciting opportunity to improve the calibration of nearly all reconstructed objects in high-energy physics detectors. However, machine learning approaches usually depend on the spectrum of examples used during training, an issue known as prior dependence. This is an undesirable property for a calibration, which needs to be applicable in a variety of settings. The purpose of this paper is to explicitly highlight the prior dependence of some machine learning-based calibration strategies. We demonstrate how recent proposals for both simulation-based and data-based calibrations inherit properties of the samples used for training, which can bias downstream analyses. In the case of simulation-based calibration, we argue that our recently proposed Gaussian Ansatz approach can avoid some of the pitfalls of prior dependence, while prior-independent data-based calibration remains an open problem.
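The prior dependence described above can be seen in a small numerical experiment: an MSE-trained calibration approximates E[truth | reco], which inherits the spectrum of the training sample. The sketch below is illustrative only and is not the paper's setup; the detector model, spectra, and binning are all assumptions chosen for clarity.

```python
# Minimal sketch of "prior dependence": the MSE-optimal calibration is the
# conditional mean E[truth | reco], which changes when the training spectrum
# (prior) changes, even though the detector response is identical.
import numpy as np

rng = np.random.default_rng(0)

def make_sample(prior_mean, prior_std, n=200_000, noise=0.1):
    truth = rng.normal(prior_mean, prior_std, n)   # "true" energy spectrum (the prior)
    reco = truth * rng.normal(1.0, noise, n)       # smeared detector response
    return truth, reco

def binned_calibration(truth, reco, bins):
    """Empirical E[truth | reco] in bins of reco -- the MSE-optimal map."""
    idx = np.digitize(reco, bins)
    return np.array([truth[idx == i].mean() if np.any(idx == i) else np.nan
                     for i in range(1, len(bins))])

bins = np.linspace(0.8, 1.6, 17)
centers = 0.5 * (bins[:-1] + bins[1:])

# Same detector response, two different training spectra (priors)
t1, r1 = make_sample(prior_mean=1.0, prior_std=0.15)
t2, r2 = make_sample(prior_mean=1.2, prior_std=0.15)
cal1 = binned_calibration(t1, r1, bins)
cal2 = binned_calibration(t2, r2, bins)

# The two learned "calibrations" disagree at the same reconstructed value:
for x, c1, c2 in zip(centers[::5], cal1[::5], cal2[::5]):
    print(f"reco={x:.2f}  cal(prior A)={c1:.3f}  cal(prior B)={c2:.3f}")
```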
Experience management is an emerging business area where organizations focus on understanding the feedback of customers and employees in order to improve their end-to-end experiences. This results in a unique set of machine learning problems to help understand how people feel, discover issues they care about, and find which actions need to be taken on data that are different in content and distribution from traditional NLP domains. In this paper, we present a case study of building text analysis applications that perform multiple classification tasks efficiently in 12 languages in the nascent business area of experience management. In order to scale up modern ML methods on experience data, we leverage cross-lingual and multi-task modeling techniques to consolidate our models into a single deployment to avoid overhead. We also make use of model compression and model distillation to reduce overall inference latency and hardware cost to a level acceptable for business needs while maintaining model prediction quality. Our findings show that multi-task modeling improves task performance for a subset of experience management tasks in both XLM-R and mBERT architectures. Among the compressed architectures we explored, we found that MiniLM achieved the best compression/performance tradeoff. Our case study demonstrates a speedup of up to 15.61x with 2.60% average task degradation (or 3.29x speedup with 1.71% degradation) and estimated savings of 44% over using the original full-size model. These results demonstrate a successful scaling up of text classification for the challenging new area of ML for experience management.
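A minimal sketch of the "single deployment" idea described above: one shared multilingual encoder with a lightweight classification head per task, so all tasks are served from a single forward pass. This is not the authors' code; the task names, label counts, and the choice of `xlm-roberta-base` from Hugging Face `transformers` are illustrative assumptions.

```python
# Shared multilingual encoder + one small head per experience-management task.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

TASK_LABELS = {"sentiment": 3, "actionability": 2, "topic": 20}  # illustrative tasks

class MultiTaskClassifier(nn.Module):
    def __init__(self, encoder_name, task_labels):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # The expensive encoder is shared; each task only adds a linear head.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, n) for task, n in task_labels.items()}
        )

    def forward(self, **enc_inputs):
        # Use the first (CLS-style) token as the pooled sentence representation.
        pooled = self.encoder(**enc_inputs).last_hidden_state[:, 0]
        return {task: head(pooled) for task, head in self.heads.items()}

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = MultiTaskClassifier("xlm-roberta-base", TASK_LABELS).eval()

batch = tokenizer(["The checkout flow was confusing.", "服务很好，非常满意。"],
                  padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**batch)            # one encoder pass serves every task head
for task, out in logits.items():
    print(task, out.shape)             # e.g. sentiment torch.Size([2, 3])
```

In this arrangement, the shared encoder could in principle be swapped for a distilled model such as MiniLM to trade some accuracy for lower latency, in the spirit of the compression results reported above.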